The objective of this report is to give an detailed description of individual steps of the data analysis

Pollen data aquisition

FOSSILPOL

The compilation of a pollen dataset for our analysis is performed a priori. Raw pollen datasets are carefully selected with the R-Fossilpol package, and the guidelines to the workflow are well described in Flantua et al. 2023 and in our website Fossilpol project. Most datasets are obtained from the Neotoma Paleoecology Database. Some additional data are from private owners in areas with data gaps, which we have limited access to use. We do not have the intellectual property rights to make these data public available. Therefore, only the derivatives of the analysis can be publicly shared. Table 1 provides a summary of the settings used in the FOSSILPOL workflow to obtain a standardised project dataset. This input data are further filter during the data processing steps in HOPE to get the final collection of a standardized dataset of high data quality we can use further in our data analyses.

Harmonisation tables

An important step in FOSSILPOL to obtain a standardised pollen data set within and across regions is the harmonisation of pollen types. Different analysts have different backgrounds and schools using different nomenclature, and the level of pollen taxonomic identifications and names can vary widely. To be able to make numerical comparisons of different pollen records, the level of pollen taxonomy should be similar. Consequently, pollen harmonisation tables have been produced for different regions to try to minimise biases related to this. The regional harmonisation tables created in our project are for Europe, Levant, Siberia, Southern Asia, Northern America, Latin America, and the Indo-Pacific region (Birks et al. harmonisation paper). These tables are used as input in the Fossilpol workflow above (see Fossilpol step_by_step guide).

Data pollen assembly

We have applied a number of filtering criteria to obtain as high a data quality as possible so that we can compare the numerical estimates on standardised data sets. These filtering criteria are: remove potentially duplicated pollen records, sorting levels (samples) by age, remove levels (samples) lower than a threshold of total number pollen grains counted (= pollen sum), remove pollen records based on age (minimum and maximum age ranges), remove levels (samples) depending on the age of the last control point, remove samples beyond the age ranges of interest, and remove pollen records if the total number of samples (N) is too low.

This filtering is done on the chronologies, raw pollen counts, harmonised pollen counts, and the age uncertainties from the age-depth models (Bchron). The preferable number of minimum pollen grains is set to 150, but this led to a great loss of datasets in regions with less data coverage, and we therefore reduced this number to 25 with the condition that less than 50 % of the samples must have a low pollen sum. This allow us to keep more datasets, but in the cases pollen records have a low minimum pollen sum, we acknowledge that the estimates of pollen assemblage properties (PAPs) are less robust. The maximum age beyond extrapolation is set to 3000 years because ages extrapolated beyond this threshold is considered highly uncertain. Finally, pollen records with less than 5 samples are removed for further analyses.

Example record

Let’s select a one record and use it as an example.

Detection of past human presence

To determine the impact of past humans on fundamental ecosystem properties, we need to develop indicators of past human presence and activity. This led to the development of a new method, where we use human event detection and indicators identified from pollen records based on expert knowledge, combined with the method for quantifying human presence based on radiocarbon dates derived from archaeological artifacts and Summed Probability Densities (SPD) (Bird et al. 2022). We believe that this solves the issue that we can use a standardised variable as indicator of past human impact, and partially avoids the difficulty of creating standardised variables to detect human disturbance events across different regions and continents. This may reduce the potential circularity of human detection events derived from the same pollen records as the estimates of ecosystem properties.

Detection of human events

For each pollen record, we have detected periods of human presence from the pollen data. Two methods have been used:

  1. detection from pollen diagrams (North America, Europe, Asia, Indopacific)
  2. detection using indicator taxa (Latin America).

Detection of human events in pollen diagrams

First, a pollen diagram of each pollen record has been examined by a regional expert and the age of each event type has been recorded.

#> 
#> ------------------------------------------------
#>     Region                 Event.type           
#> --------------- --------------------------------
#>  North America   BI = Before Impact; FC = First 
#>                    Cultivation; ES = European   
#>                            Settlement           
#> 
#>     Europe       BI = Before Impact; FI = First 
#>                     Indication; FCu = First     
#>                   Cultivation; EC = Extensive   
#>                     Clearance; CC = Complete    
#>                            Clearance            
#> 
#>      Asia        BI = Before Impact; FI = First 
#>                     Indication; FCu = First     
#>                   Cultivation; EI = Extensive   
#>                              Impact             
#> 
#>   Indopasific    no_impact = No Impact; weak =  
#>                   Weak Impact; medium = Medium  
#>                  Impact; strong = Strong Impact 
#> ------------------------------------------------
#> 
#> Table: Table 2: Type of human events identified in pollen diagrams

Note that the event types are uniquely defined within continents, and event types with the same name have different meanings between continents.

Second, an algorithm is made to obtain the binary variables (0/1) associated with each event type that is identified in each pollen record. A new vector with the average ages in between levels (samples) of the identified event type was created because the time of the events is assumed to have occurred prior to the changed event. A new matrix was created that uses the new age vector with the events type. The different event types are assigned binary values (0/ 1) depending on if the event type is present (i.e. when the age of the specific event is detected). Ultimately, logical principles were followed and the binary values were adjusted to get the event data for each pollen record. If no human event is recognised, it does not necessarily mean that humans were absent, but instead that there was not enough information to identify human activity in the pollen records.

Below is a figure that intends to represent the data associated with human events in example record. The different colours represent the differnet types of human events in each region. The smooth trend lines represent simple binomial GAM models that show the main trends and alteration in the timing of events over time.

Arcehological artefacts and Summed Probability Densities

The quantification of summed probability distribution (SPD). requires a distance to be selected around each site location to collect the relevant dates of archaeological artefacts around it. This will limit the area of human presence and indirectly the amount of human activity relevant to pollen records from each site.

We used the global dataset of radicarbon dates (RC dates) of archeological artefacts from Bird et al. 2022.

We have gathered all radiocarbon dates up to 500 km away and split them into groups (categories) by the distance to the sequence.

Only RC dates with valid geographical location (longitude and latitude), and ‘LocAccuracy’ > 0 were selected. For each pollen record, RC dates were classified by the geographical distance to the pollen record. The chosen distance classes were: 5, 25, 50, 100, 250, 500 km. For each distance category, we used all RC dates up to the maximum distance of selected category. For example, 250 contains all RC dates up to 250 km including 5, 25,…

However, distance class with less than (250 RC dates) (a Threshold) is filtered out in order to maintain robust SPD estimation.

Estimation of SPD

For each distance category, SPD is estimated using spd function from rcarbon package for each distance class for each year between a minimum threshold age and 12 ka.

Radiocarbon dates were calibrated using calibrate function from rcarbon package with appropriate calibration curves (“IntCal20”, “ShCal20”, “mixed”). Calibration curves were obtained rcarbon package and “mixed” was created using rcarbon::mixCurves function with ‘p’ = 0.5. Calibration curves were assigned by their geographical location following Hua et al., 2013.

For the selected seqeunces intcal20 have been used.

The temporal distribution of SPDs constructed by diferent distances:

Regional age cutoff

The minimum threshold ages are different for different regions and are decided based on the availability of radiocarbon dating for different regions. In general there is a bias that radiocarbon dating is rather limited on younger material where in many regions there are a lack of C14 data during the last 2000 years. Table 5 show the ages where data younger than age_from where removed.

#> 
#> --------------------------
#>     region       age_from 
#> --------------- ----------
#>     Europe         2000   
#> 
#>  Latin America     2000   
#> 
#>      Asia          2000   
#> 
#>  North America     500    
#> 
#>     Oceania        500    
#> --------------------------

Selection of the distance

In order to select the best distance, we want to know, which SPD curve is the best at explaining the changes in pollen events.

Therefore, for each distance, we calculated Redundancy Analysis (RDA) with individual event values as response and SPD as predictors.

Here is a visualisation of the data to fit:

Or in a form of table:

#> 
#> ----------------------------------------------------------------------------
#>  age    no impact   first impact   extensive clearince   complete clearince 
#> ------ ----------- -------------- --------------------- --------------------
#>  3790       0            1                  0                    0          
#> 
#>  6767       1            0                  0                    0          
#> 
#>  7015       1            0                  0                    0          
#> 
#>  7262       1            0                  0                    0          
#> 
#>  7869       1            0                  0                    0          
#> 
#>  8236       1            0                  0                    0          
#> ----------------------------------------------------------------------------
#> 
#> Table: Table continues below
#> 
#>  
#> ----------------------------------------------------------
#>  first cultivation     50        100       250      500   
#> ------------------- --------- --------- --------- --------
#>          0           0.02323   0.06117   0.2413    0.9766 
#> 
#>          0           0.01443   0.0655    0.3516    1.124  
#> 
#>          0           0.01545   0.05377   0.2674    0.9394 
#> 
#>          0           0.01738   0.04415   0.2374    0.673  
#> 
#>          0           0.01178   0.02182   0.1267    0.2665 
#> 
#>          0           0.00652   0.01761   0.08569   0.2014 
#> ----------------------------------------------------------

We can plot the individual ordianation but it is not very informative. The size and ocupancy of points corespond with their age.

R2

We can evaluate the amount of variability explained by each ordination. We then extracted adj.r2 for each SPD distnce:

Therefore we have selected 500 as the best distance and the prefered SPD curve for 4197 sequence

Selecting best distance per continent

Here is a overview of the status of records base on the presence of humna impact from pollen diagrams and RC data

Now we can apply the selection of best distance to all sequences which fullfil such criteria: 1. Have events 2. Have at least one SPD constructed

We can summary the best distance per each continent

Pollen assemblage properties (PAP) estimation

To prepare the response variables of our main pollen dataset compilation and to be able to analyse fundamental ecosystem properties, we prepared the standard estimates of pollen assemblage properties (PAP) (Bhatta et al. 2023). The PAP estimations provide different aspects of pollen assemblage diversity which includes palynological richness, diversity and evenness, compositional change and turnover, and Rate-of-Change (RoC).

These response variables are calculated using the newly developed R-Ecopol package that contain all the functions needed to estimate PAPs for our pollen data assembly. The base functions used in this package are derived from other dependency packages such as mvpart package (Therneau et al. 2014) to estimate pollen zonations with multivariate regression trees, vegan (Oksanen et al. 2022) for other mutivariate techniques and dissimilarity indices functions from iNext (Chao et al. 2014) that have been modified to extract interpolated Hill numbers based on a minimum sample size, and newly developed R functions to run DCCA using Canoco 4.5 (ter Braak xxxx) to list a few, among other, dependency packages. Additionally, R-Ratepol (Mottl 2021) is used get the estimates of RoC

Pollen richness, diversity, and evenness

The different aspects of palynological diversity are estimated using Hill’s effective species numbers N0, N1, N2, and the associated evenness ratios of N2/N1 and N1/N0. These are combined through one equation where the effective species numbers differ mainly in how the rare taxa are weighted in the parameter q:

\[^q{D} = (\sum_{i=1}^{S} p_{i}^{q})^{1/(1-q)}\]

When q is 0, rare and abundant taxa have equal weight and the number is simply the number of taxa in the sample. The equation is not possible to define for q = 1, but as it approaches 1, it is equal to the exponential of the well-known Shannon index and reports the number of equally common taxa. When q = 2, it is the same as the inverse Simpson diversity index and provides the number of equally abundant taxa with a low weight on rare taxa. The advantage of using effective species numbers is that they provide easily interpretable units and contain the doubling effect. To standardize the sample sizes, we use the rarefaction approach developed by Chao et al. These estimates are rarefied to the number of n = 150 grains, or in some cases to a lower sum (minimum n = 25). Some pollen records were only available as pollen percentages, and as the sample size is unknown, these are then rarefied to the minimum sum of percentages. The evenness ratios will be 1 if all taxa are equally abundant, and the ratios hence indicate changes in abundances between the numbers of rare, equally common, and abundant taxa.

We acknowledge that even though attempts are made to standardise richness and diversity estimates based on standard sample size, there are additional biases that are not taken into consideration such as differences in total pollen production and pollen representation (Odgaard 1998, 2001). In some cases, the total pollen sum is also too low to be considered a robust estimate, but it was a choice made on balancing loosing too much information from geographical areas with less data coverage (see data filtering above).

Compositional change

Compositional change is calculated using multivariate regression trees (MRT) with age as the constraining variable. MRT is in general a robust tool to explore and predict changes in multivariate data sets using environmental predictor variables (De’ath, Simpson and Birks 2012). This technique has been adopted in palaeoecology to detect major zones in pollen diagrams or shifts between periods of homogeneous vegetation in time (Simpson and Birks 2012). We use the pollen taxa in percentages without any data transformations as the response and the median ages derived from the age-depth model as the constraining variable. The recursive partitioning are based on chi-square distances between pollen samples constrained by time. The number of cross-validation is set to 1000, and the optimal sized tree is chosen based on the 1SD rule (Simpson and Birks 2012).

Compositional turnover

Compositional turnover is estimated using detrended canonical correspondence analysis (DCCA) with age as the explanatory variable (ter Braak and Smilauer 2007?). Changes in Weighted average (WA) sample scores (CaseR scores sensu ter Braak and Smilauer 2012) are measures of compositional turnover in standard deviation (SD) units (Birks 2007). The WA scores are regressed with time using a second-order polynomial (age+age^2) to allow more flexibility in the turnover pattern within a pollen record. Total compositional turnover is a measure of the total length of CaseR scores along the DCCA axis 1, whereas the pattern within a record is the measures between the individual samples along the DCCA axis 1. The response data are pollen percentages without any transformation to maintain the chi-square distances between samples, whereas the ages are the median ages derived from the age-depth model for each site.

Rate-of-change

Rate-of-change for the pollen assemblages in the pollen records are quantified using the novel R-Ratepol package (Mottl et al. 2021). RoC is estimated using moving windows of 500 years’ time bins of five number of windows shifts where samples are randomly selected for each bin. This approach is shown to increase the correct detection of RoC peak-points than the more traditional approaches (Mottle et al. 2021). RoC are reported as dissimilarity per 500 years using the Chord dissimilarity coefficient. Sample size is standardized in each working unit to 150 grains or the lowest number detected in each dataset. We use only the RoC scores further in the analyses.

Change-points detection and density estimates

Change-points detection of all the PAP variables are calculated using conventional regression trees (RT) for single variables with Euclidean distances. The transitions between the resulting groups (or zones) per variable is detected and saved as binary (0/1) variables. A change-point is defined as 1, where the mean ages between the two consecutive samples are used as the timing of this significant change. This is done individually for each PAP variable.

The significant change points of the richness, diversity, and evenness variables are combined into one variable, and the significant change points of compositional turnover, compositional change, and rate-of-change is combined in a second variable. The density of these two variables are calculated using a Gaussian kernel,and re-scaled to each of specific age ranges for each individual pollen record (i.e. minimum and maximum ages). To solve the boundary issue in density estimation the data is reflected to 0. We extract the interpolated values at every 500 years time step.

Equal spacing of variables

All response variables have been estimated using the harmonised pollen records for each location. To obtain estimates of equal spacing of 500 years, we used linear interpolation. In the context of the temporal analysis we analyse samples distributed in space across time, and equal time steps are necessary.

In order to choose a method of interpolation to obtain data on equal time steps, we compared generalise additive models (GAM), hierarchical generalised additive models (HGAM), and simple linear interpolation. By applying linear interpolation, we found that the correlation structure between the multivariate response variables are more similar to the original estimates without equal spacing than applying a GAM or HGAM. The GAM or HGAM models sometime showed unexpected patterns in single PAP estimate that changed these correlations. Since we cannot individually assess all single models for each of the variable in all of the records (>1000), we choose the simplest linear interpolation method. Similar issues were detected when estimating the density variables of changes points. The first approach was to estimate densities of the individual change points, and then use hierarchical generalised additive models (HGAM) to find the common pattern between the two groups representing significant changes in richness, diversity, and evenness, or change in pollen assemblages (MRT, RoC, DCCA1). As some of the models did not converge and showed inconsistent patterns we use the density estimates on the combined variables directly and extracted the interpolated values at every 500 years step.

Paleo Climate

Paleoclimate from the CHELSA-TraCE21k downscaling algorithm is downloaded from the CHELSA database (Karger et al. 2021, Karger et al. 2021). The selected bioclimatic variables are annual mean temperatures ℃ (bio1, temp_annual), minimum temperatures of coldest month ℃ (bio6, temp_cold), precipitation of warmest quarter kg m-2 quarter-1 (bio18, prec_summer) and precipitation of coldest quarter kg m-2 quarter-1 (bio19, prec_win), where we extracted climate values for the coordinates for each dataset_id retrieving the full time series of every 100 years. In addition, we downloaded the monthly climatology for daily maximum near-surface temperature K/10 (tasmin).

Numerical analysis

The whole project is relative complex, therefore we included a conceptual workflow figure (Figure 1). Note: Figure is work in progress and will be improved with graphic designer.

Figure 1
Figure 1

Numerical analysis of hypothesis 1

Hierarchical variation partitioning (hVarPart)

To test if the ecological processes have changed due to past human activity in single records, we use reduced rank multivariate regression. This is also known as distance-based redundancy analysis (db-RDA). We used the R package rdacca.hp to run hierarchial variation partitioning with several predictors. This estimates the variation per variables in different combinations to get the average variable importance independent of the order of predictors. db-RDA was performed using Gower-distances adding a constant because our response data, the PAPS, is a mixture of different units.

Depending on the type of spatial or temporal analysis for hypothesis 1, the explanatory variables are either past human impact (SPDs), palaeoclimatic variables, and/or time (see below).

SPD is the variable of main interest as it represent past human presence. The palaeoclimate is a matrix of summer precipitation, winter precipitation, annual temperatures, and winter temperatures. These are selected as we considered them most relevant to represent major differences in climatic conditions in all the regions (in respect to differences in warm, cold, dry, wet, or regions with high seasonality). Time is represented by the ages of each pollen record, however, this is more difficult to interpret. We assume age may represent time dependent changes such as natural successions and/or ecological changes due to interaction between taxa.

In the hierarchichal variation partitioning analysis, the predictor variables can be applied either as individual predictors or as groups of predictors. In our case, we run the analysis with groups of predictors. This means that the palaeoclimatic variables are included as one matrix and not assessed as individual predictors. (The overall results is not very different from using individual predictors).

The analysis is run in two different ways:

  1. to analyse spatial changes which run the hierarchical variation partitioning for single record, then summarise per continent
  2. to analyse the temporal patterns in space for each region of the 500 year time steps.

Spatial changes

We create a single model per each recods. We then use the adjusted r2 to assess the goodness-of-fit of models. Adjusted r2 is the modified version of r2 that corrects for the number of samples and predictors in the model. It is calculated using the Ezekiel formula adjr2 = 1 - (1- r2)*(n-1)/ (n-m-1), where n = number of samples in the dataset, and m is the number of variables. In vegan, these are so called semipartial r2 (Legendre et al. 2011). Negative adjusted r2 were replaced by 0 in our analysis. The assessment of a satisfactory model fit is contingent upon the underlying data characteristics. Thus, we considered the range of adjusted r2 and removed models with a total fit falling below the 5% lower quantile threshold, specifically those with a total adj r2 below 0.121.

Then we extracted the total variation of that whole model as well as varience explained by individual predictors. There are three types of variation partition per each predictor:

  • total predictor variation
  • shared predictor variation
  • unique predictor variation

We then use the total model variation to calculate the percentage of shared and unique variation explained by each predictor.

#> 
#> ------------------------------------------------------------------
#>  total_variance   predictor   predictor_unique   predictor_shared 
#> ---------------- ----------- ------------------ ------------------
#>      0.283          human          0.0717               0         
#> 
#>      0.283         climate         0.0083             0.1008      
#> 
#>      0.283          time           0.0321             0.0897      
#> ------------------------------------------------------------------
#> 
#> Table: Table: Result table of a single record for spatial hierarchichal variation partitioning (continued below)
#> 
#>  
#> -----------------------------------------------------------------------
#>  predictor_total   predictor_unique_percent   predictor_shared_percent 
#> ----------------- -------------------------- --------------------------
#>      0.0521                 25.34                        0             
#> 
#>      0.1091                 2.933                      35.62           
#> 
#>      0.1218                 11.34                       31.7           
#> -----------------------------------------------------------------------
#> 
#> Table: Table continues below
#> 
#>  
#> -------------------------
#>  predictor_total_percent 
#> -------------------------
#>           18.41          
#> 
#>           38.55          
#> 
#>           43.04          
#> -------------------------

However, for certain records, the shared variance of predictors is much larger than the total variance of the model. This caused percentages to be very high.

#> 
#> ------------------------------------------------------------------
#>  total_variance   predictor   predictor_unique   predictor_shared 
#> ---------------- ----------- ------------------ ------------------
#>      0.035          human            0                0.0523      
#> 
#>      0.035         climate           0                0.0943      
#> 
#>      0.035          time           0.0162             0.1104      
#> ------------------------------------------------------------------
#> 
#> Table: Table: Result table of a single record for spatial hierarchichal variation partitioning (continued below)
#> 
#>  
#> -----------------------------------------------------------------------
#>  predictor_total   predictor_unique_percent   predictor_shared_percent 
#> ----------------- -------------------------- --------------------------
#>         0                     0                        149.4           
#> 
#>         0                     0                        269.4           
#> 
#>      0.1266                 46.29                      315.4           
#> -----------------------------------------------------------------------
#> 
#> Table: Table continues below
#> 
#>  
#> -------------------------
#>  predictor_total_percent 
#> -------------------------
#>             0            
#> 
#>             0            
#> 
#>           361.7          
#> -------------------------

In these cases, the outliers above 100% were removed.

Summary across climate zone per continent

We then summarised the results per each climate zone within a continent.

Here is an example of the distribution of unique variance explained by humans for Europe (part of Figure 3).

We then estimate the the median value of variance explained by each predictor (unique and shared).

In adddition, we plot the distribution of other parts of variance for each predictor. This is part of side plots for each continents in Figure 2.

Figure 2

Finally, all those parts are combined in Figure 2.

Figure 2
Figure 2

Temporal variatin of case scores

This is the only text about it i can find anywhere. Can @vivian look into thit?

Records with scores constrained by humans when time and climate is partialled out that explains more than 10% of the variance plotted along time. Dashed vertical lines reflect MRT partitions, whereas no MRT split was detected for North America. These indicate potential change around 3-5 ka, whereas the r2 ranges between 0.3-0.61, and all models have a high prediction error (CV Error > 1) indicating very low prediction power

Here is an example for Europe:

#> X-Val rep : 1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35  36  37  38  39  40  41  42  43  44  45  46  47  48  49  50  51  52  53  54  55  56  57  58  59  60  61  62  63  64  65  66  67  68  69  70  71  72  73  74  75  76  77  78  79  80  81  82  83  84  85  86  87  88  89  90  91  92  93  94  95  96  97  98  99  100  101  102  103  104  105  106  107  108  109  110  111  112  113  114  115  116  117  118  119  120  121  122  123  124  125  126  127  128  129  130  131  132  133  134  135  136  137  138  139  140  141  142  143  144  145  146  147  148  149  150  151  152  153  154  155  156  157  158  159  160  161  162  163  164  165  166  167  168  169  170  171  172  173  174  175  176  177  178  179  180  181  182  183  184  185  186  187  188  189  190  191  192  193  194  195  196  197  198  199  200  201  202  203  204  205  206  207  208  209  210  211  212  213  214  215  216  217  218  219  220  221  222  223  224  225  226  227  228  229  230  231  232  233  234  235  236  237  238  239  240  241  242  243  244  245  246  247  248  249  250  251  252  253  254  255  256  257  258  259  260  261  262  263  264  265  266  267  268  269  270  271  272  273  274  275  276  277  278  279  280  281  282  283  284  285  286  287  288  289  290  291  292  293  294  295  296  297  298  299  300  301  302  303  304  305  306  307  308  309  310  311  312  313  314  315  316  317  318  319  320  321  322  323  324  325  326  327  328  329  330  331  332  333  334  335  336  337  338  339  340  341  342  343  344  345  346  347  348  349  350  351  352  353  354  355  356  357  358  359  360  361  362  363  364  365  366  367  368  369  370  371  372  373  374  375  376  377  378  379  380  381  382  383  384  385  386  387  388  389  390  391  392  393  394  395  396  397  398  399  400  401  402  403  404  405  406  407  408  409  410  411  412  413  414  415  416  417  418  419  420  421  422  423  424  425  426  427  428  429  430  431  432  433  434  435  436  437  438  439  440  441  442  443  444  445  446  447  448  449  450  451  452  453  454  455  456  457  458  459  460  461  462  463  464  465  466  467  468  469  470  471  472  473  474  475  476  477  478  479  480  481  482  483  484  485  486  487  488  489  490  491  492  493  494  495  496  497  498  499  500  501  502  503  504  505  506  507  508  509  510  511  512  513  514  515  516  517  518  519  520  521  522  523  524  525  526  527  528  529  530  531  532  533  534  535  536  537  538  539  540  541  542  543  544  545  546  547  548  549  550  551  552  553  554  555  556  557  558  559  560  561  562  563  564  565  566  567  568  569  570  571  572  573  574  575  576  577  578  579  580  581  582  583  584  585  586  587  588  589  590  591  592  593  594  595  596  597  598  599  600  601  602  603  604  605  606  607  608  609  610  611  612  613  614  615  616  617  618  619  620  621  622  623  624  625  626  627  628  629  630  631  632  633  634  635  636  637  638  639  640  641  642  643  644  645  646  647  648  649  650  651  652  653  654  655  656  657  658  659  660  661  662  663  664  665  666  667  668  669  670  671  672  673  674  675  676  677  678  679  680  681  682  683  684  685  686  687  688  689  690  691  692  693  694  695  696  697  698  699  700  701  702  703  704  705  706  707  708  709  710  711  712  713  714  715  716  717  718  719  720  721  722  723  724  725  726  727  728  729  730  731  732  733  734  735  736  737  738  739  740  741  742  743  744  745  746  747  748  749  750  751  752  753  754  755  756  757  758  759  760  761  762  763  764  765  766  767  768  769  770  771  772  773  774  775  776  777  778  779  780  781  782  783  784  785  786  787  788  789  790  791  792  793  794  795  796  797  798  799  800  801  802  803  804  805  806  807  808  809  810  811  812  813  814  815  816  817  818  819  820  821  822  823  824  825  826  827  828  829  830  831  832  833  834  835  836  837  838  839  840  841  842  843  844  845  846  847  848  849  850  851  852  853  854  855  856  857  858  859  860  861  862  863  864  865  866  867  868  869  870  871  872  873  874  875  876  877  878  879  880  881  882  883  884  885  886  887  888  889  890  891  892  893  894  895  896  897  898  899  900  901  902  903  904  905  906  907  908  909  910  911  912  913  914  915  916  917  918  919  920  921  922  923  924  925  926  927  928  929  930  931  932  933  934  935  936  937  938  939  940  941  942  943  944  945  946  947  948  949  950  951  952  953  954  955  956  957  958  959  960  961  962  963  964  965  966  967  968  969  970  971  972  973  974  975  976  977  978  979  980  981  982  983  984  985  986  987  988  989  990  991  992  993  994  995  996  997  998  999  1000
#> Minimum tree sizes
#> tabmins
#>    1 
#> 1000

Figure 3

Those are then combined with the distribution of variance explained unique by humans in Figure 3.

Figure 3
Figure 3

Numerical analysis of hypothesis 2

Multidimentional shift

To test Hypothesis 2, that the interrelationships among pollen assemblage properties in response to humans, we integrated assemblage properties at each 500-year time slice within climate zones across continents. We used a principal component analysis (PCA) with standardisation of all indices.

Here is an example of PCA for Europe - Temperate Without dry season - 2000 yr slice:

To quantify the overall changes in the interrelationship among pollen assemblage properties, we employed symmetrical Procrustes sum-of squares (m2) to compare pairwise PCA changes within each time slice.

Here is a m2 matrix of all time slices to each other for Europe - Temperate Without dry season:

We can then plot the changes between subsequent time slices (diagonal+1 in the matrix) in relationship to the time. Here each point is a m2 value between 2 time slices, the full line represent GAM model and the dahsed line LOESS smoother. Here is an examle for Europe - Temperate Without dry season:

hVarPart

Subsequently, for each continent and climate zone, we used the pairwise m2 as distance matrices in hierarchical variation partitioning to quantify the importance of humans, climate, and time in explaining the multidimensional shifts among pollen assemblage properties through time. Subsequently, the explanatory variables derived from HGAMs are integrated into hierarchical variation partitioning to gauge the extent to which human activity and/or climate shifts influenced variations in the interrelationships among pollen assemblage properties.

The visualisation is the same as H1. Here is an example for Europe:

Figure 4

Finally, all those parts are combined in Figure 4.

Figure 4
Figure 4